Skip to content

fix(ci): improve auto bm note narrative#893

Merged
phernandez merged 7 commits into
mainfrom
codex/improve-auto-bm-note-narrative
Jun 5, 2026
Merged

fix(ci): improve auto bm note narrative#893
phernandez merged 7 commits into
mainfrom
codex/improve-auto-bm-note-narrative

Conversation

@phernandez
Copy link
Copy Markdown
Member

Summary

  • enrich merged PR context with changed files, commits, and linked issue details when GITHUB_TOKEN is available
  • require narrative AgentSynthesis fields and render story/problem/solution/impact sections in project_update notes
  • update the checked-in and generated Auto BM prompt/schema guidance, including replacing the visible Why It Matters heading with Project Memory

Verification

  • BASIC_MEMORY_ENV=test uv run pytest tests/ci/test_project_updates.py tests/cli/test_ci_commands.py -q --no-cov
  • prompt renderer sanity check
  • just fast-check
  • git diff --check

Signed-off-by: phernandez <paul@basicmachines.co>
Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Claude finished @phernandez's task in 2m 0s —— View job


PR Review: fix(ci): improve auto bm note narrative

Tasks:

  • Gather context and understand the request
  • Read changed files
  • Review against team checklist
  • Post final review

Code Quality & Standards

  • Follows Basic Memory's coding conventions in CLAUDE.md
  • Python 3.12+ type annotations and async patterns — Pydantic v2 ConfigDict, field_validator, model_copy
  • SQLAlchemy 2.0 best practices — N/A (no DB changes)
  • FastAPI and Typer conventions followed
  • 100-character line length maintained
  • No commented-out code blocks

Testing & Documentation

  • Unit tests for new functions/methods
  • Integration tests — N/A (CI-layer code; unit tests are appropriate)
  • Edge cases covered (sparse payloads, missing repo, token absent)
  • 100% test coverage — new helpers covered; see one gap note below
  • Documentation updated (README.md, memory-ci-capture.md)
  • CLAUDE.md — no convention changes needed

Basic Memory Architecture

  • Local-first architecture preserved
  • No DB changes, no migrations needed
  • Auth boundary correctly maintained — GITHUB_TOKEN stays in collect step; BASIC_MEMORY_API_KEY stays in publish step
  • Idempotency logic unchanged

Security & Performance

  • No hardcoded secrets or credentials
  • Input validation defensive throughout (all helpers use isinstance guards)
  • Timeout on HTTP calls (timeout=20)
  • No sensitive data in notes or logs
  • GITHUB_TOKEN read from env only, never logged

Compatibility

  • Windows path compatibility — Path(file_path).parent.as_posix() in _note_write_target correctly emits forward slashes
  • No emojis or Unicode characters in console output

Findings

Minor: why_it_matters field name vs ## Project Memory heading

AgentSynthesis.why_it_matters (line 117) renders as ## Project Memory in build_project_update_note (line 586). The mismatch is harmless — the validator name in the schema stays why_it_matters — but without a comment a future reader may wonder if a rename was accidentally half-finished.

src/basic_memory/ci/project_updates.py:115-117 and 580-586 — a one-line comment on the field or the section entry would make the intent clear. Fix this →

Minor: linked_issue_details enriched but not rendered in note

_enrich_pull_request_context populates context.linked_issue_details (titles, bodies, state, URLs) but build_project_update_note never renders that list into the note. The prompt instructs Codex to use those details as evidence when synthesizing, so they do influence the agent output. But future readers of the raw context JSON can see the details while readers of the published note cannot. If that gap is intentional (details are synthesized into the narrative rather than appended verbatim), a comment in build_project_update_note near the source-links block would make it obvious. If it's an oversight, a ## Linked Issues section similar to _extend_list_section output would surface them.

Minor: _github_api_get_list pagination coverage

test_collect_enriches_pull_request_context_from_github_api returns a single-item list from fake_github_api_get, which means pagination never triggers. The fast path (len(payload) < 100 → return) is correct by inspection but lacks a test exercising the page-2 branch. Low risk for CI volumes, but worth noting for completeness.

Positive observations

  • The _required_text validator on the six narrative fields is a good guardrail — Codex structured output can return empty strings, and catching that at parse time is better than publishing a hollow note.
  • ConfigDict(extra="ignore") on AgentSynthesis is the right defensive move for structured-output consumers — lets Codex add speculative fields without breaking publish.
  • model_copy(update={...}) in _enrich_pull_request_context keeps ProjectUpdateContext effectively immutable; the context passed in is never mutated.
  • _body_excerpt truncating issue bodies to 2000 chars is a sensible cap before they hit the agent prompt.
  • changed_files_count is correctly preserved from the PR payload even when GITHUB_TOKEN is absent — the agent can still reference it even when the per-file list is unavailable.
  • The README explanation of the fail-fast behaviour for token-present enrichment failures is clear and matches the implementation.

Overall: Solid, well-tested expansion of the CI enrichment layer. The three findings above are all minor — the code is correct as-is. The main narrative sections (## Story, ## Problem Addressed, ## How The Change Solves It, ## Impact On The System, ## Project Memory) are a meaningful improvement over the previous flat note structure.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Claude finished @phernandez's task in 2m 11s —— View job


PR Review — fix(ci): improve auto bm note narrative

  • Gather context and understand PR scope
  • Read changed files
  • Review code quality and standards
  • Check testing coverage
  • Check architecture patterns
  • Check security and performance
  • Post final review summary

Summary

Solid, well-scoped PR. The enrichment layer (files + commits + linked issue details from GitHub API), the new required narrative fields on AgentSynthesis, the note section rename (## Why It Matters## Project Memory), and the updated prompt contract are all consistent and internally coherent. Code follows project conventions throughout.


Code Quality & Standards

Item Status
Python 3.12+ type annotations
Typer conventions followed
100-char line length
No commented-out code
No hardcoded secrets ✅ (env vars only)
No emojis/unicode in console output
Windows file path compat ✅ (as_posix() used in _note_write_target)

Observations

1. API failure propagation is intentional but worth confirming

_enrich_pull_request_context in project_updates.py:371 raises ValueError on any GitHub API error (HTTP error, network error). This propagates to collect_project_update_context → CLI collect → exit code 1, failing the CI step entirely.

The README explicitly documents this: "If the token is present and GitHub API enrichment fails, the Auto BM workflow fails fast instead of publishing a weak note."

That's a deliberate design call and matches the project's fail-fast philosophy. No action needed — just confirming this is intentional.

2. Missing test: API error path for enrichment

There's a test for the happy-path enrichment (test_collect_enriches_pull_request_context_from_github_api at tests/ci/test_project_updates.py:98) but no test for when GITHUB_TOKEN is present and the API call returns a 4xx/5xx. Given the README documents this behavior explicitly, a test like test_collect_fails_when_github_api_returns_error would complete the coverage contract for the new error path.

Fix this →

3. _github_api_get does not handle JSON decode errors from the API body

project_updates.py:285-286:

with urllib.request.urlopen(request, timeout=20) as response:
    payload = json.loads(response.read().decode("utf-8"))

If GitHub returns a non-JSON body (e.g. a 200 with a maintenance HTML page), this raises an undecorated json.JSONDecodeError, not a ValueError with context. The collect CLI command only catches ValueError, so this would surface as an unhandled exception traceback rather than a clean error message.

Low probability in practice (GitHub API is reliable), but a try/except json.JSONDecodeError wrapping with a raise ValueError(...) would be consistent with the rest of the module's error handling style.

Fix this →

4. changed_files_count and changed_files can diverge

ProjectUpdateContext has both changed_files_count (from the PR payload integer field) and changed_files (list populated by API enrichment). After enrichment, len(changed_files) may differ from changed_files_count if the GitHub API paginates or truncates (GitHub caps files at 3000 in the REST API). Not a bug — these are distinct concepts (payload count vs fetched detail list) — but a future reader might not notice the difference. The note renderer only uses the list, so no behavioral issue here.

5. AgentSynthesis.model_config = ConfigDict(extra="ignore") with required fields

project_updates.py:110: extra="ignore" is retained from before, which means the model silently drops unknown fields from Codex output. The new required fields (story, problem_addressed, solution, system_impact) are all validated via _required_text to be non-empty strings. If Codex returns an extra field (e.g. legacy title or intent from an older schema), it's dropped cleanly. ✅


Testing & Documentation

Item Status
Unit tests for new API enrichment ✅ (test_collect_enriches_pull_request_context_from_github_api)
Tests for required narrative fields ✅ (test_agent_synthesis_requires_delivery_narrative_fields, test_build_project_update_note_renders_story_sections)
## Project Memory / no ## Why It Matters test ✅ (test_build_project_update_note_renders_story_sections:419-420)
API error path test ⚠️ Missing (see observation #2)
README updated
memory-ci-capture.md updated and matches render_capture_prompt() ✅ (checked-in file matches rendered output)

Checklist

  • Follows Basic Memory coding conventions
  • Python 3.12+ type annotations and async patterns
  • N/A — no SQLAlchemy changes
  • FastAPI and Typer conventions followed
  • 100-character line length maintained
  • No commented-out code blocks
  • Unit tests for new functions/methods
  • N/A — no new MCP tools
  • [~] Test coverage for edge cases — API error path missing
  • [~] 100% test coverage — unverified for new json.JSONDecodeError path in _github_api_get
  • Documentation updated
  • N/A — CLAUDE.md conventions unchanged
  • N/A — no new MCP tools
  • N/A — no database changes
  • Preserves local-first architecture principles
  • N/A — no knowledge graph changes
  • N/A — no markdown file handling changes
  • No hardcoded secrets or credentials
  • Input validation for new models
  • Proper error handling (with one minor gap noted)
  • Performance considerations addressed (pagination handles >100 files)
  • No sensitive data in logs
  • Windows-compatible file path comparisons
  • No emojis or unicode in console/log output

Two minor items to consider: add a test for the API failure path (observation #2) and optionally harden the JSON decode path in _github_api_get (observation #3). Neither blocks merge.

Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Claude finished @phernandez's task in 2m 48s —— View job


PR Review: fix(ci): improve auto bm note narrative

Tasks:

  • Gather context and understand PR scope
  • Read CLAUDE.md for project guidelines
  • Read changed files
  • Review code quality and standards
  • Check testing coverage
  • Check architecture patterns
  • Check security and performance
  • Post final review summary

Summary

This PR enriches Auto BM's PR context with changed files, commits, and linked issue details from the GitHub API; makes the AgentSynthesis narrative fields (story, problem_addressed, solution, system_impact) required; and renames the rendered section ## Why It Matters## Project Memory. Changes are well-scoped and architecturally sound. A few issues with test isolation and coverage need attention before merge.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory's coding conventions — section-header comments, fail-fast, no speculative getattr
  • Python 3.12+ type annotations throughout
  • SQLAlchemy / FastAPI — not applicable (CI module)
  • Typer conventions followed
  • 100-character line length maintained
  • No commented-out code blocks

Testing & Documentation

  • Unit tests added for all new helpers
  • Edge cases covered (sparse payloads, missing repo, non-merged PR, unsupported events)
  • README, prompt file, and schema notes all updated consistently
  • 100% coverage — see issues below
  • CLAUDE.md — no convention changes warranting update

Basic Memory Architecture

  • Atomic, composable design maintained
  • No database changes (CI-only module)
  • Local-first principles preserved; enrichment is opportunistic (no token → graceful no-op)
  • Note idempotency and identity fields come from ProjectUpdateContext, not AgentSynthesis

Security & Performance

  • GITHUB_TOKEN read from env, never logged or committed
  • All API responses validated with isinstance before use
  • timeout=20 on each HTTP call; pagination caps at 100/page
  • No sensitive data in logs
  • No emojis or non-ASCII in console output

Compatibility

  • Path.as_posix() used for cross-platform directory strings in _note_write_target

Issues

1. Test isolation: test_collect_merged_pull_request_context may make real GitHub API calls in CI

tests/ci/test_project_updates.py:78

The test invokes collect_project_update_context with a payload containing repo="basicmachines-co/basic-memory" and pr_number=123. When GITHUB_TOKEN is present in the environment (which GitHub Actions sets automatically for every workflow run), _enrich_pull_request_context will make live API calls to the GitHub API. The test asserts nothing about changed_files or commits, so it may silently pass with unpredictable network data, or fail due to rate limits or transient errors.

Fix: add monkeypatch.delenv("GITHUB_TOKEN", raising=False) to the test fixture, or pass monkeypatch and unset the env var at the top of the test.

Fix this →

2. Missing coverage for _github_api_get error paths

src/basic_memory/ci/project_updates.py:274-294

The project requires 100% test coverage. The error branches in _github_api_geturllib.error.HTTPError, urllib.error.URLError, and the non-list/dict response guard — have no test coverage. Similarly, _github_api_get_list's non-list-response path (line 304) is untested.

These branches can be covered by monkeypatching urllib.request.urlopen or _github_api_get directly in the test suite.

Fix this →

3. Missing coverage for helper None-input paths

src/basic_memory/ci/project_updates.py:328-368

The _changed_file_from_github, _commit_summary_from_github, and _issue_detail_from_github functions each have an early-return None path for malformed input (e.g. missing filename, missing commit dict, missing number). These paths are not exercised by the current test suite. Add short tests passing None, {}, or structurally invalid dicts to cover them.

Fix this →

4. raising=False in monkeypatch is unnecessary and masks rename mistakes

tests/ci/test_project_updates.py:135

monkeypatch.setattr(project_updates, "_github_api_get", fake_github_api_get, raising=False)

raising=False means if _github_api_get is ever renamed or moved, the monkeypatch silently does nothing and the test runs against the real GitHub API. The default raising=True is the safer choice here since the attribute is known to exist.

Fix this →


Minor Observations (non-blocking)

  • render_workflow uses actions/checkout@v6 and actions/setup-python@v6 — these appear to be pre-existing placeholders for future action versions and are not changed by this PR, so no action needed here.
  • No rate-limit handling in _github_api_get_list — a 429 from GitHub will raise ValueError and fail CI. For most repos this is fine; the README correctly documents fail-fast as the intended behavior.
  • Pagination termination — the len(payload) < 100 sentinel is correct; GitHub returns an empty array when exhausted, which terminates the loop before any infinite-loop risk.

The core implementation — GitHub API enrichment, required narrative fields, and updated section headings — is solid and well-tested for the happy path. The issues flagged above are primarily around test isolation and 100% coverage enforcement. Would recommend addressing issues 1–3 before merge.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d47c8c1647

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

project=project,
project_id=project_id,
workspace=workspace,
refresh=refresh_schemas,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Let schema refresh run in installed repos

When --refresh-schemas is used in a repo that already has Auto BM installed, execution never reaches this refreshed seed call because _write_generated_files(...) above still rejects the existing workflow/config/prompt unless --force is also passed. That makes the new refresh/update flags unusable for their documented upgrade path of refreshing schema notes without overwriting generated files; skip or relax the generated-file write when only refreshing schemas.

Useful? React with 👍 / 👎.

Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Claude finished @phernandez's task in 2m 25s —— View job


PR Review: fix(ci): improve auto bm note narrative

Tasks:

  • Gather context and understand PR scope
  • Read changed files
  • Review code quality and standards
  • Check testing coverage
  • Check architecture patterns
  • Check security and performance
  • Post final review summary

Summary

Well-scoped PR that enriches the Auto BM context with changed files, commits, and linked issue details from the GitHub API; makes narrative fields required on AgentSynthesis; renames ## Why It Matters## Project Memory; and updates the checked-in prompt/schema. Core implementation is solid. Three coverage gaps must be addressed before merge per the 100%-coverage rule.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory coding conventions in CLAUDE.md — section headers, fail-fast, no speculative getattr
  • Python 3.12+ type annotations — ConfigDict, field_validator, model_copy all correct
  • SQLAlchemy 2.0 — not applicable (CI module only)
  • FastAPI and Typer conventions followed
  • 100-character line length maintained
  • No commented-out code blocks

Testing & Documentation

  • Unit tests for new helpers (_changed_file_from_github, _commit_summary_from_github, etc. via enrichment path)
  • Integration tests — not applicable; CI-only module, unit tests are appropriate
  • [~] Edge case coverage — see Issues below
  • [~] 100% test coverage — three gaps identified
  • Documentation updated (README.md, memory-ci-capture.md)
  • CLAUDE.md — no convention changes

Basic Memory Architecture

  • Atomic, composable design maintained
  • No database changes; no migrations needed
  • Local-first architecture preserved; enrichment is opportunistic (no token → no-op)
  • model_copy(update={...}) keeps ProjectUpdateContext effectively immutable

Security & Performance

  • GITHUB_TOKEN read from env only — never logged or hardcoded
  • timeout=20 on all HTTP calls
  • Pagination correct: len(payload) < 100 sentinel terminates cleanly at the last page
  • _body_excerpt caps issue bodies at 2000 chars before they reach the agent prompt
  • ConfigDict(extra="ignore") on AgentSynthesis — right defensive call for structured-output consumers

Compatibility

  • Path(file_path).parent.as_posix() used in _note_write_target — forward slashes on Windows
  • No emojis or non-ASCII in console/log output

Issues

1. Test isolation: GITHUB_TOKEN not unset in test_collect_merged_pull_request_context

tests/ci/test_project_updates.py:78

GitHub Actions sets GITHUB_TOKEN automatically in every workflow run. Without unsetting it, this test calls _enrich_pull_request_context, which tries to fetch live data for PR #123 in basicmachines-co/basic-memory. If that PR doesn't exist the API returns 404, ValueError propagates, and the test fails in CI with a confusing error.

Fix: add monkeypatch parameter and call monkeypatch.delenv("GITHUB_TOKEN", raising=False) at the top of the test body.

Fix this →

2. raising=False in monkeypatch masks future renames

tests/ci/test_project_updates.py:135

monkeypatch.setattr(project_updates, "_github_api_get", fake_github_api_get, raising=False)

raising=False means that if _github_api_get is ever renamed or moved, the patch silently does nothing and the test runs against the live GitHub API. The default raising=True is the right choice — the attribute is known to exist and should fail explicitly if it ever disappears.

Fix this →

3. Coverage gaps for _github_api_get error paths and helper None-return paths

src/basic_memory/ci/project_updates.py:287-293, :303-304, :328-368

The project requires 100% test coverage. The following branches are untested:

  • _github_api_geturllib.error.HTTPError handler (line 287), urllib.error.URLError handler (line 291), and non-list/dict response guard (line 292)
  • _github_api_get_list — non-list response path (line 304)
  • _changed_file_from_github(None) and _changed_file_from_github({}) — returns None
  • _commit_summary_from_github(None) — returns None; _commit_summary_from_github({"sha": "x"}) (missing commit key) — returns None
  • _issue_detail_from_github(None) and _issue_detail_from_github({}) — returns None
  • _body_excerpt("x" * 2001) — truncation branch (len(stripped) > limit)

All of these can be covered with short, direct unit tests using monkeypatching or by calling the private helpers directly.

Fix this →


Minor Observations (non-blocking)

why_it_matters renders as ## Project Memory

AgentSynthesis.why_it_matters (line 117) renders as ## Project Memory in build_project_update_note (line 585). The intent is clear from reading both places, but a one-line comment on either the field or the section entry would confirm to a future reader that this is intentional rather than an accidental half-rename.

linked_issue_details enriched but not rendered in the note

_enrich_pull_request_context populates context.linked_issue_details with fetched issue titles, bodies, and states, but build_project_update_note never renders that list into the published note. The prompt instructs Codex to use them as synthesis evidence, so they influence the output — but a comment near the source-links block explaining that choice (or omission) would help future contributors.


Response to Codex Inline Review Comment

The Codex inline comment on ci.py:113 claims that --refresh-schemas is unusable when files already exist because _write_generated_files "rejects" them. This is incorrect. When preserve_existing=True and any target exists, _write_generated_files returns False — it does not raise. Execution continues straight to seed_project_update_schemas, which runs unconditionally. The test test_setup_refreshes_schema_notes_when_generated_files_already_exist (line 143) verifies exactly this: existing files are preserved AND schema seeding is called. No action needed.


Overall: Core implementation is correct and well-tested for the happy path. Three items above must be fixed before merge — test isolation (#1) and missing coverage (#3) per the 100%-coverage rule, and the raising=False footgun (#2). The Codex inline comment is a false positive.

Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Claude finished @phernandez's task in 2m 25s —— View job


PR Review: fix(ci): improve auto bm note narrative

Tasks:

  • Gather context and understand PR scope
  • Read changed files
  • Review code quality and standards
  • Check testing coverage
  • Check architecture patterns
  • Check security and performance
  • Post final review summary

Summary

Solid, well-scoped PR. The GitHub API enrichment layer (changed files, commits, linked issue details), required narrative fields on AgentSynthesis, the ## Why It Matters## Project Memory rename, and the updated prompt/schema contract are all coherent and follow project conventions. Three issues need attention before merge: a test isolation bug that will cause flaky CI failures, raising=False on two monkeypatches that will silently break if the patched symbol is renamed, and coverage gaps required by the 100%-coverage rule.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory coding conventions — section-header comments, fail-fast, no speculative getattr
  • Python 3.12+ type annotations — ConfigDict, field_validator, model_copy all correct
  • SQLAlchemy 2.0 — not applicable (CI module only)
  • FastAPI and Typer conventions followed
  • 100-character line length maintained
  • No commented-out code blocks

Testing & Documentation

  • Unit tests for new helpers
  • Integration tests — not applicable; CI-only module
  • [~] Edge case coverage — see issues below
  • [~] 100% test coverage — three gaps remain
  • Documentation updated (README.md, memory-ci-capture.md)
  • CLAUDE.md — no convention changes needed

Basic Memory Architecture

  • Atomic, composable design maintained
  • No database changes; no migrations needed
  • Local-first architecture preserved; enrichment is opportunistic (no token → no-op)
  • model_copy(update={...}) keeps ProjectUpdateContext effectively immutable
  • Auth boundary correct — GITHUB_TOKEN in collect step, BASIC_MEMORY_API_KEY in publish step

Security & Performance

  • GITHUB_TOKEN read from env only — never logged or hardcoded
  • timeout=20 on all HTTP calls
  • Pagination terminates correctly (len(payload) < 100 sentinel)
  • _body_excerpt caps issue bodies at 2000 chars before agent prompt
  • ConfigDict(extra="ignore") — right defensive call for structured-output consumers

Compatibility

  • Path(file_path).parent.as_posix() in _note_write_target — forward slashes on Windows
  • No emojis or non-ASCII in console/log output

Issues

1. Test isolation: GITHUB_TOKEN not unset in test_collect_merged_pull_request_context

tests/ci/test_project_updates.py:78

GitHub Actions sets GITHUB_TOKEN automatically in every workflow run. Without unsetting it, this test will call _enrich_pull_request_context with a real token, which tries to fetch live data for PR #123 in basicmachines-co/basic-memory. If that PR doesn't exist (or if rate limits are hit), ValueError propagates and the test fails with a confusing error.

Fix: add a monkeypatch parameter and call monkeypatch.delenv("GITHUB_TOKEN", raising=False) at the start of the test body.

Fix this →

2. raising=False in two monkeypatches masks future renames

tests/ci/test_project_updates.py:135 and :166

Both test_collect_enriches_pull_request_context_from_github_api and test_github_api_get_list_fetches_multiple_pages use raising=False:

monkeypatch.setattr(project_updates, "_github_api_get", fake_github_api_get, raising=False)

If _github_api_get is ever renamed or moved, the patch silently does nothing and both tests run against the live GitHub API. The default raising=True is the correct choice — the attribute is known to exist and should fail explicitly if it disappears.

Fix this →

3. Coverage gaps for error paths and helper None-return paths

src/basic_memory/ci/project_updates.py:287-293, :303-304, :319-325, :328-368

The project requires 100% test coverage. The following branches are untested:

  • _github_api_geturllib.error.HTTPError handler (line 287), urllib.error.URLError handler (line 290), non-list/dict response guard (line 292)
  • _github_api_get_list — non-list response path (line 303)
  • _body_excerpt — truncation branch (len(stripped) > limit, line 324)
  • _changed_file_from_github(None) and _changed_file_from_github({}) — early-return None path
  • _commit_summary_from_github(None) and _commit_summary_from_github({"sha": "x"}) (no commit key) — early-return None paths
  • _issue_detail_from_github(None) and _issue_detail_from_github({}) — early-return None path

All are coverable with short direct unit tests, either by monkeypatching urllib.request.urlopen or calling the private helpers directly via project_updates._helper(...).

Fix this →


Minor Observations (non-blocking)

why_it_matters renders as ## Project Memory

AgentSynthesis.why_it_matters (line 117) renders as ## Project Memory in build_project_update_note (line 585). Intent is clear from reading both, but a one-line comment on the field (e.g. # rendered as "## Project Memory" heading in the note) would confirm to a future reader this is intentional, not an accidental half-rename.

linked_issue_details enriched but not rendered in note body

_enrich_pull_request_context populates context.linked_issue_details with issue titles, bodies, and states, but build_project_update_note never renders them verbatim — they surface only indirectly via _linked_issue_source_links. The prompt instructs Codex to synthesize from them, so this is likely intentional. A short comment near the source-links block would make that clear to future contributors.

Response to Codex inline review comment on ci.py:113

The Codex comment claims --refresh-schemas is unusable when files exist because _write_generated_files "rejects" them. This is incorrect. When preserve_existing=True, _write_generated_files returns False early without raising. Execution always reaches seed_project_update_schemas unconditionally. The test test_setup_refreshes_schema_notes_when_generated_files_already_exist (line 143) verifies exactly this — existing files are preserved AND schema seeding runs. No action needed.


Overall: The core implementation is correct and internally consistent. Issues #1 and #3 must be resolved before merge per the CI-stability and 100%-coverage rules. Issue #2 is a low-risk footgun that is straightforward to fix at the same time.

Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Claude finished @phernandez's task in 1m 57s —— View job


PR Review: fix(ci): improve auto bm note narrative

Tasks:

  • Gather context and understand the request
  • Read changed files
  • Review against team checklist
  • Post final review

Summary

Solid enrichment of the Auto BM CI layer. The GitHub API enrichment (changed files, commits, linked issue details), required narrative fields on AgentSynthesis, the ## Why It Matters## Project Memory section rename, and the updated prompt/schema contract are all coherent and well-implemented. Three issues carried over from prior reviews remain unfixed.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory coding conventions — section-header comments, fail-fast, no speculative getattr
  • Python 3.12+ type annotations — ConfigDict, field_validator, model_copy used correctly
  • SQLAlchemy 2.0 — not applicable (CI module)
  • FastAPI and Typer conventions followed
  • 100-character line length maintained
  • No commented-out code blocks

Testing & Documentation

  • Unit tests for new helpers
  • Integration tests — not applicable; CI-only module
  • [~] Edge case coverage — three gaps remain (see Issues)
  • [~] 100% test coverage — three known gaps unaddressed
  • Documentation updated (README.md, memory-ci-capture.md)
  • CLAUDE.md — no convention changes

Basic Memory Architecture

  • Atomic, composable design maintained
  • No database changes; no migrations needed
  • Local-first architecture preserved; enrichment is opportunistic (no token → no-op)
  • model_copy(update={...}) keeps ProjectUpdateContext effectively immutable
  • Auth boundary correct — GITHUB_TOKEN in collect step, BASIC_MEMORY_API_KEY in publish step

Security & Performance

  • GITHUB_TOKEN read from env only — never logged or hardcoded
  • timeout=20 on all HTTP calls
  • Pagination correct — len(payload) < 100 sentinel terminates cleanly
  • _body_excerpt caps issue bodies at 2000 chars before agent prompt
  • ConfigDict(extra="ignore") — right defensive call for structured-output consumers

Compatibility

  • Path(file_path).parent.as_posix() in _note_write_target — forward slashes on Windows
  • No emojis or non-ASCII in console/log output

Issues

1. Test isolation: GITHUB_TOKEN not unset in test_collect_merged_pull_request_context

tests/ci/test_project_updates.py:78

GitHub Actions sets GITHUB_TOKEN automatically in every workflow run. Without unsetting it, this test calls _enrich_pull_request_context with the real token, which tries to fetch PR #123 from basicmachines-co/basic-memory. If that PR doesn't exist (or rate limits are hit), ValueError propagates and the test fails with a confusing error in CI.

Fix: add a monkeypatch parameter and call monkeypatch.delenv("GITHUB_TOKEN", raising=False) at the start of the test body.

Fix this →

2. raising=False in two monkeypatches masks future renames

tests/ci/test_project_updates.py:135 and :166

Both test_collect_enriches_pull_request_context_from_github_api and test_github_api_get_list_fetches_multiple_pages use raising=False:

monkeypatch.setattr(project_updates, "_github_api_get", fake_github_api_get, raising=False)

If _github_api_get is ever renamed or moved, the patch silently does nothing and both tests call the live GitHub API. Remove raising=False from both to restore the default raising=True — the attribute is known to exist and should fail loudly if it disappears.

Fix this →

3. Coverage gaps for error paths and helper None-return paths

src/basic_memory/ci/project_updates.py:287-293, :303-304, :319-325, :328-368

The project requires 100% test coverage. The following branches are untested:

  • _github_api_geturllib.error.HTTPError handler (line 287), urllib.error.URLError handler (line 290), non-list/dict response guard (line 292)
  • _github_api_get_list — non-list response path (line 303)
  • _body_excerpt — truncation branch at line 324 (len(stripped) > limit)
  • _changed_file_from_github(None) and _changed_file_from_github({}) — early-return None path
  • _commit_summary_from_github(None) and _commit_summary_from_github({"sha": "x"}) (no commit key) — early-return None paths
  • _issue_detail_from_github(None) and _issue_detail_from_github({}) — early-return None path

All are coverable with short direct unit tests — either monkeypatching urllib.request.urlopen for the network paths, or calling the private helpers directly via project_updates._helper(...).

Fix this →


Positive Observations

  • The # Keep the structured-output field stable while using the clearer note heading. comment at project_updates.py:585 directly addresses the why_it_matters## Project Memory mismatch flagged in prior reviews. ✅
  • _required_text validator on the six narrative fields is a solid guardrail — catches empty-string Codex output at parse time.
  • ConfigDict(extra="ignore") on AgentSynthesis correctly handles legacy/extra Codex fields without breaking publish.
  • The Codex inline comment claiming --refresh-schemas is "unusable when files already exist" is incorrect. _write_generated_files with preserve_existing=True returns False early without raising; seed_project_update_schemas runs unconditionally on line 108. test_setup_refreshes_schema_notes_when_generated_files_already_exist verifies exactly this. No action needed.

Overall: The core implementation is correct. Issues #1 (flaky CI) and #3 (100% coverage) should be fixed before merge. Issue #2 is a low-risk footgun — straightforward to fix alongside #3.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0bbf32b4ec

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +401 to +402
detail = _issue_detail_from_github(
_github_api_get(f"/repos/{context.repo}/issues/{number}", token)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Treat missing linked issue details as best-effort

Because the generated workflow now always sets GITHUB_TOKEN, every eligible PR body/title containing a #number reaches this lookup. If that number is not an issue in the same repo (for example text like attempt #2, an RFC number, or a deleted issue), GitHub returns 404 and _github_api_get raises ValueError, causing bm ci collect to fail before any project update is published. The issue-detail enrichment should skip unavailable issue details rather than making the whole collection step fail.

Useful? React with 👍 / 👎.

Comment on lines 274 to 278
await mcp_write_note(
title=spec.title,
title=title,
content=spec.content,
directory="schemas",
directory=directory,
project=routed_project,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve the schema file name when refreshing

When --refresh-schemas finds an existing schema note whose filename does not match its title, this call still passes only the existing title and parent directory to write_note. Since write_note derives the target path from title + directory, overwrite=True will create/update custom/schemas/<title>.md instead of the matched file_path (for example custom/schemas/update.md), so refresh can duplicate the schema note rather than updating the installed one. Use the matched path/permalink, or otherwise derive the write target from the existing filename, when refreshing.

Useful? React with 👍 / 👎.

Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Claude finished @phernandez's task in 2m 12s —— View job


PR Review: fix(ci): improve auto bm note narrative

Tasks:

  • Gather context and understand PR scope
  • Read changed files
  • Review code quality and standards
  • Check testing coverage
  • Check architecture patterns
  • Check security and performance
  • Post final review summary

Summary

Well-scoped PR. The GitHub API enrichment (changed files, commits, linked issue details), required narrative fields on AgentSynthesis, the ## Why It Matters## Project Memory rename, and the updated prompt/schema contract are all coherent and internally consistent. Core implementation is clean and follows project conventions. Four issues need attention before merge — one functional correctness issue that other reviews have missed, two test gaps required by the 100%-coverage rule, and a test isolation bug that will cause flaky CI failures.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory coding conventions in CLAUDE.md — section headers, fail-fast, no speculative getattr
  • Python 3.12+ type annotations — ConfigDict, field_validator, model_copy all correct
  • SQLAlchemy 2.0 — not applicable (CI module only)
  • FastAPI and Typer conventions followed
  • 100-character line length maintained
  • No commented-out code blocks

Testing & Documentation

  • Unit tests for new helpers
  • Integration tests — not applicable; CI-only module
  • [~] Edge case coverage — four issues below
  • [~] 100% test coverage — coverage gaps remain
  • Documentation updated (README.md, memory-ci-capture.md, SOUL.md)
  • CLAUDE.md — no convention changes needed

Basic Memory Architecture

  • Atomic, composable design maintained
  • No database changes; no migrations needed
  • Local-first architecture preserved; enrichment is opportunistic (no token → no-op)
  • model_copy(update={...}) keeps ProjectUpdateContext effectively immutable
  • Auth boundary correct — GITHUB_TOKEN in collect step, BASIC_MEMORY_API_KEY in publish step

Security & Performance

  • GITHUB_TOKEN read from env only — never logged or hardcoded
  • timeout=20 on all HTTP calls
  • Pagination terminates correctly — len(payload) < 100 sentinel
  • _body_excerpt caps issue bodies at 2000 chars before agent prompt
  • ConfigDict(extra="ignore") — right defensive call for structured-output consumers

Compatibility

  • Path(file_path).parent.as_posix() in _note_write_target — forward slashes on Windows
  • No emojis or non-ASCII in console/log output

Issues

1. Functional: linked issue 404 can silently fail the CI collect step

src/basic_memory/ci/project_updates.py:398-406

_linked_issues() extracts #number patterns from both the PR title and body using a regex. This matches any #digit sequence — including text like "attempt #2", "release #5", or any issue number that happens to be closed/deleted/in-a-different-repo. When GITHUB_TOKEN is present, _enrich_pull_request_context calls _github_api_get(f"/repos/{context.repo}/issues/{number}", token) for every matched reference. If that issue doesn't exist (GitHub returns 404), _github_api_get raises ValueError, which propagates through _collect_pull_request_contextcollect_project_update_contextcollect CLI command → exit code 1, blocking the CI step before any note is published.

This is a genuine production breakage risk: a merged PR with a body containing "step #3 of the plan" or a cross-repo issue reference would fail the entire Auto BM workflow. The linked issue enrichment should be best-effort — catch ValueError per issue and skip unfetchable ones rather than aborting the whole context collection.

# Current — fails the whole step if any issue is 404
detail = _issue_detail_from_github(
    _github_api_get(f"/repos/{context.repo}/issues/{number}", token)
)

# Better — skip issues that can't be fetched
try:
    raw = _github_api_get(f"/repos/{context.repo}/issues/{number}", token)
    detail = _issue_detail_from_github(raw)
except ValueError:
    detail = None

Fix this →

2. Test isolation: GITHUB_TOKEN not unset in test_collect_merged_pull_request_context

tests/ci/test_project_updates.py:79

GitHub Actions sets GITHUB_TOKEN automatically in every workflow run. Without unsetting it, this test calls _enrich_pull_request_context with the real token, which tries to fetch live data for PR #123 in basicmachines-co/basic-memory. This introduces network dependency and non-determinism in a unit test — it should be isolated.

Fix: add monkeypatch: pytest.MonkeyPatch parameter and call monkeypatch.delenv("GITHUB_TOKEN", raising=False) at the start of the test body.

Fix this →

3. Coverage gaps for error paths and helper None-return paths

src/basic_memory/ci/project_updates.py:287-293, :303-304, :319-325, :329-368

The project requires 100% test coverage. The following branches are untested:

  • _github_api_geturllib.error.HTTPError handler (line 288), urllib.error.URLError handler (line 291), non-list/dict response guard (line 293)
  • _github_api_get_list — non-list response path (line 304)
  • _body_excerpt — truncation branch at line 325 (len(stripped) > limit)
  • _changed_file_from_github(None) and _changed_file_from_github({}) — early-return None path
  • _commit_summary_from_github(None) and _commit_summary_from_github({"sha": "x"}) (no commit key) — early-return None
  • _issue_detail_from_github(None) and _issue_detail_from_github({}) — early-return None

All are coverable with short direct unit tests — monkeypatch urllib.request.urlopen for the network paths, call private helpers directly via project_updates._helper(...) for the rest.

Fix this →

4. raising=False in two monkeypatches masks future renames

tests/ci/test_project_updates.py:136 and :167

Both test_collect_enriches_pull_request_context_from_github_api and test_github_api_get_list_fetches_multiple_pages use raising=False:

monkeypatch.setattr(project_updates, "_github_api_get", fake_github_api_get, raising=False)

If _github_api_get is ever renamed or moved, the patch silently does nothing and both tests call the live GitHub API. Remove raising=False from both to restore the default raising=True.

Fix this →


Positive Observations

  • The # Keep the structured-output field stable while using the clearer note heading. comment at project_updates.py:586 directly addresses the why_it_matters## Project Memory mismatch. ✅
  • _required_text validator on six narrative fields is a solid guardrail — catches empty-string structured output at parse time rather than publishing a hollow note.
  • ConfigDict(extra="ignore") on AgentSynthesis correctly handles legacy/extra fields from older schemas without breaking publish.
  • model_copy(update={...}) in _enrich_pull_request_context keeps ProjectUpdateContext effectively immutable — the passed-in context is never mutated.
  • changed_files_count is correctly preserved from the PR payload even when GITHUB_TOKEN is absent — the agent can still reference it even when the per-file list is unavailable.
  • SOUL.md template is a clean addition — voice guide stays separate from schema contract, can't override source facts.

Response to Codex Inline Comment on ci.py:113

The Codex comment claiming --refresh-schemas is "unusable when files already exist" is incorrect. In _write_generated_files, when preserve_existing=True and any target exists, the function returns False early without raising. Execution always continues to seed_project_update_schemas unconditionally on line 110. The test test_setup_refreshes_schema_notes_when_generated_files_already_exist verifies exactly this — existing files are preserved AND schema seeding runs. No action needed.

The Codex comment at project_updates.py:403 about linked issue 404 is valid and is covered by Issue #1 above.


Overall: The core implementation is correct and well-tested for the happy path. Issue #1 (linked issue 404 fail-fast) is a functional correctness gap that prior reviews missed and should be fixed before merge. Issues #2#4 (test isolation, coverage gaps, raising=False footgun) are the same items flagged in prior reviews and remain unaddressed.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b828aaf055

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

files = {
repo_root / DEFAULT_WORKFLOW_PATH: render_workflow(config),
repo_root / DEFAULT_PROMPT_PATH: render_capture_prompt(),
repo_root / DEFAULT_SOUL_PATH: render_soul_template(),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Seed the new soul guide without forcing overwrites

When upgrading a repo that already has the previous Auto BM workflow/config/prompt but does not yet have SOUL.md, adding the soul guide to this all-or-nothing generated-file set leaves no non-destructive setup path: normal setup aborts on the existing files, --refresh-schemas returns before writing the missing soul file, and --force rewrites any customized workflow/config/prompt. Please treat SOUL.md as a missing-only seed, or write missing generated files before rejecting existing ones.

Useful? React with 👍 / 👎.

@phernandez phernandez merged commit 8887267 into main Jun 5, 2026
24 of 27 checks passed
@phernandez phernandez deleted the codex/improve-auto-bm-note-narrative branch June 5, 2026 03:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant